Detecting Adversarial Samples Using Density Ratio Estimates

Author

  • Lovedeep Gondara
Abstract

Machine learning models, especially those based on deep learning, are used in everyday applications ranging from self-driving cars to medical diagnostics. However, such models are easy to trick with adversarial samples: inputs that are indistinguishable from real samples to the human eye yet can lead to incorrect classifications. The impact of adversarial samples is far-reaching, and their efficient detection remains an open problem. In this paper we propose using direct density ratio estimation as a model-agnostic measure to detect adversarial samples, and we empirically show that adversarial samples have different underlying probability densities from real samples. Our proposed method works well with color and grayscale images, and with different adversarial sample generation methods.
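The abstract does not say which direct density ratio estimator is used, so the following is only a minimal sketch under stated assumptions: it uses a uLSIF-style least-squares fit with Gaussian kernels (an assumed choice), and synthetic 2-D Gaussian data stands in for real and adversarial image features. The function names `gaussian_kernel` and `fit_ulsif`, and the values of `sigma`, `lam`, and `n_centers`, are hypothetical and chosen for illustration.

```python
import numpy as np

def gaussian_kernel(x, centers, sigma):
    # Pairwise squared distances between rows of x and rows of centers.
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fit_ulsif(x_nu, x_de, sigma=1.0, lam=0.1, n_centers=100, seed=0):
    """Estimate r(x) = p_nu(x) / p_de(x) without estimating either density.

    Assumed estimator: uLSIF-style regularized least squares on a
    Gaussian kernel basis centered on a subset of numerator samples.
    """
    rng = np.random.default_rng(seed)
    n_basis = min(n_centers, len(x_nu))
    centers = x_nu[rng.choice(len(x_nu), size=n_basis, replace=False)]
    phi_de = gaussian_kernel(x_de, centers, sigma)
    phi_nu = gaussian_kernel(x_nu, centers, sigma)
    H = phi_de.T @ phi_de / len(x_de)   # second moment under the denominator
    h = phi_nu.mean(axis=0)             # first moment under the numerator
    alpha = np.linalg.solve(H + lam * np.eye(n_basis), h)
    alpha = np.maximum(alpha, 0.0)      # a density ratio cannot be negative
    return lambda x: gaussian_kernel(x, centers, sigma) @ alpha

# Toy stand-ins (assumption): two batches of "real" samples from the same
# distribution, and a "suspect" batch drawn from a shifted distribution.
rng = np.random.default_rng(0)
real_a = rng.normal(0.0, 1.0, size=(500, 2))
real_b = rng.normal(0.0, 1.0, size=(500, 2))
suspect = rng.normal(2.0, 1.0, size=(500, 2))

ratio_same = fit_ulsif(real_a, real_b)
ratio_shift = fit_ulsif(suspect, real_b)

# Mean estimated ratio over the numerator batch: near 1 when both batches
# share a density, and larger when the densities differ.
score_same = ratio_same(real_a).mean()
score_shift = ratio_shift(suspect).mean()
print("same-distribution score:", score_same)
print("shifted-distribution score:", score_shift)
```

In this sketch a batch is flagged when its score is clearly above the score of a held-out real batch. Where to place that threshold is a design choice not covered by the abstract.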

Similar resources

Detecting Adversarial Samples from Artifacts

Deep neural networks (DNNs) are powerful nonlinear architectures that are known to be robust to random perturbations of the input. However, these models are vulnerable to adversarial perturbations—small input changes crafted explicitly to fool the model. In this paper, we ask whether a DNN can distinguish adversarial samples from their normal and noisy counterparts. We investigate model confide...

Calibrating Energy-based Generative Adversarial Networks

In this paper we propose equipping Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we develop a flexible adversarial training framework, and prove this framework not only ensures the generator converges to the true data distribution, but also enables the discriminator to retain the density information at the global optimum. We deriv...

Dihedral angle prediction using generative adversarial networks

Several dihedral angles prediction methods were developed for protein structure prediction and their other applications. However, distribution of predicted angles would not be similar to that of real angles. To address this we employed generative adversarial networks (GAN) which showed promising results in image generation tasks. Generative adversarial networks are composed of two adversarially...

Attack and Defense of Dynamic Analysis-Based, Adversarial Neural Malware Classification Models

Recently researchers have proposed using deep learning-based systems for malware detection. Unfortunately, all deep learning classification systems are vulnerable to adversarial attacks where miscreants can avoid detection by the classification algorithm with very few perturbations of the input data. Previous work has studied adversarial attacks against static analysis-based malware classifiers ...

Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight

Deep reinforcement learning has shown promising results in learning control policies for complex sequential decision-making tasks. However, these neural network-based policies are known to be vulnerable to adversarial examples. This vulnerability poses a potentially serious threat to safety-critical systems such as autonomous vehicles. In this paper, we propose a defense mechanism to defend rei...

Journal:
  • CoRR

Volume abs/1705.02224

Publication date: 2017